Branch-and-Bound Reconstruction of Ancestral Sequences

Authors

  • Nir Friedman
  • Itsik Pe’er
  • Tal Pupko
Abstract

The problem of ancestral sequence reconstruction is the statistical inference of sequences that correspond to internal nodes in a phylogenetic tree [1]. Joint reconstruction is the task of seeking the most likely set of ancestral states corresponding to all the ancestral taxa, while marginal reconstruction aims at inferring the sequence at a specific internal node. In simple probabilistic models of evolution, both tasks can be performed efficiently using dynamic programming [3, 1]. The situation is more complicated in more detailed models of evolution, such as models with among-site rate variation (ASRV). In these models, one assumes that the rate of evolution can vary among different sites. This is modeled by introducing a latent quantity that represents the rate at each site. Maximum likelihood (ML) models incorporating ASRV are statistically superior to those assuming among-site rate homogeneity [2]. For example, it was shown that strong support for rodent nonmonophyly results from systematic error associated with the oversimplified assumption of homogeneity [4]. Currently, no efficient algorithm exists for joint ancestral reconstruction in ASRV models. In particular, dynamic programming approaches fail in these models. In this work we devise a branch-and-bound algorithm for joint ancestral reconstruction under ASRV and show that it can find the most likely reconstruction for large phylogenies.
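
To make the idea in the abstract concrete, here is a minimal Python sketch of branch-and-bound joint reconstruction at a single site. It is an illustration under stated assumptions, not the authors' algorithm: the toy tree, the leaf characters, the three discrete rate categories, the Jukes-Cantor substitution model, and every function name are hypothetical. The bound is one simple valid choice: because the maximum of a sum is at most the sum of the maxima, the best joint likelihood of any completion of a partial assignment is bounded by the prior-weighted sum, over rate categories, of the best per-rate completion, and each per-rate maximum can be computed by dynamic programming on the tree.

```python
import itertools
import math

STATES = "ACGT"

# Hypothetical toy data: child -> (parent, branch length); "R" is the root.
TREE = {"X": ("R", 0.1), "Y": ("R", 0.3),
        "a": ("X", 0.2), "b": ("X", 0.2), "c": ("Y", 0.1), "d": ("Y", 0.4)}
INTERNAL = ["R", "X", "Y"]
LEAVES = {"a": "A", "b": "A", "c": "C", "d": "G"}  # observed characters at one site
RATES = [(0.2, 0.5), (1.0, 0.3), (3.0, 0.2)]       # (rate, prior probability), assumed

CHILDREN = {}
for child, (parent, length) in TREE.items():
    CHILDREN.setdefault(parent, []).append((child, length))


def jc_prob(a, b, t):
    """Jukes-Cantor probability of observing b after time t, starting from a."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e


def best_below(node, state, rate, fixed):
    """For one fixed rate: max over unassigned descendants of the product of
    branch probabilities below `node`, given that `node` is in `state`."""
    score = 1.0
    for child, length in CHILDREN.get(node, []):
        options = fixed[child] if child in fixed else STATES
        score *= max(jc_prob(state, s, rate * length) * best_below(child, s, rate, fixed)
                     for s in options)
    return score


def upper_bound(fixed):
    """Prior-weighted sum over rates of the best per-rate completion.
    Equals the exact joint likelihood once every internal node is fixed."""
    root_options = fixed["R"] if "R" in fixed else STATES
    return sum(p * max(0.25 * best_below("R", s, r, fixed) for s in root_options)
               for r, p in RATES)


def branch_and_bound():
    """Depth-first search over internal-node states, pruning by the bound."""
    best = {"score": 0.0, "assign": None}

    def recurse(i, fixed):
        bound = upper_bound(fixed)
        if bound <= best["score"]:
            return  # no completion of this partial assignment can improve
        if i == len(INTERNAL):
            best["score"] = bound
            best["assign"] = {n: fixed[n] for n in INTERNAL}
            return
        for s in STATES:
            recurse(i + 1, {**fixed, INTERNAL[i]: s})

    recurse(0, dict(LEAVES))
    return best["assign"], best["score"]


def exhaustive():
    """Reference answer: score all 4**len(INTERNAL) joint assignments."""
    return max(upper_bound({**LEAVES, **dict(zip(INTERNAL, combo))})
               for combo in itertools.product(STATES, repeat=len(INTERNAL)))


if __name__ == "__main__":
    assignment, score = branch_and_bound()
    print(assignment, score, exhaustive())
```

On this toy input the search returns the same maximum as exhaustive enumeration while skipping any subtree of the search whose bound cannot beat the best score found so far. The per-rate maximization is exactly the dynamic programming step that works in rate-homogeneous models; it is the sum over the latent rate that prevents it from solving the joint problem directly.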

Similar resources

On the inference of large phylogenies with long branches: How long is too long?

The accurate reconstruction of phylogenies from short molecular sequences is an important problem in computational biology. Recent work has highlighted deep connections between sequence-length requirements for high-probability phylogeny reconstruction and the related problem of the estimation of ancestral sequences. In [Daskalakis et al.’09], building on the work of [Mossel’04], a tight sequence...

On the inference of large phylogenies with long branches: How long is too long?

The accurate reconstruction of phylogenies from short molecular sequences is an important problem in computational biology. Recent work has highlighted deep connections between sequence-length requirements for high-probability phylogeny reconstruction and the related problem of the estimation of ancestral sequences. In Daskalakis et al. (in Probab. Theory Relat. Fields 2010), building on the wo...

A branch-and-bound algorithm for the inference of ancestral amino-acid sequences when the replacement rate varies among sites: Application to the evolution of five gene families

MOTIVATION We developed an algorithm to reconstruct ancestral sequences, taking into account the rate variation among sites of the protein sequences. Our algorithm maximizes the joint probability of the ancestral sequences, assuming that the rate is gamma distributed among sites. Our algorithm provably finds the global maximum. The use of 'joint' reconstruction is motivated by studies that use ...

GeneTRACE - Reconstruction of Gene Content of Ancestral Species

While current computational methods allow the reconstruction of individual ancestral protein sequences, reconstruction of complete gene content of ancestral species is not yet an established task. In this paper, we describe GENETRACE, an efficient linear-time algorithm that allows the reconstruction of evolutionary history of individual protein families as well as the complete gene content of a...

Publication date: 2001